Processing Large Datasets with the CSV Module

This notebook illustrates how the csv module can be used to incrementally process a large subset of OpenNEX DCP30 data in Python.

In this example, we'll identify the minimum and maximum temperatures in the continental United States as predicted by the CESM1-CAM5 climate model under the RCP4.5 scenario. The data used in this example is available at http://opennex.planetos.com/dcp30/LpJMh.


Import Required Modules

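Let's begin by importing the required modules. We'll need urllib2 to request the data from the server and csv to parse it. Note that urllib2 is a Python 2 module, and the code in this notebook is written for Python 2.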


In [17]:
import csv
import urllib2

pr_min_max

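The pr_min_max function iterates through a dataset containing the average highs and lows (the tasmax and tasmin variables) for each location, one row at a time. It keeps track of the highest tasmax row and the lowest tasmin row, then prints the temperature (converted from kelvin to degrees Celsius), month, and location of each.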


In [18]:
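def pr_min_max(ip_addr):
    # Sentinel rows: any real temperature (in kelvin) replaces them.
    mintemp = {'Value': float('inf')}
    maxtemp = {'Value': float('-inf')}

    # DictReader pulls rows from the HTTP response as it iterates,
    # so the dataset is processed incrementally.
    cr = csv.DictReader(urllib2.urlopen("http://%s:7645/data.csv" % ip_addr))

    for row in cr:
        temp = float(row['Value'])
        var = row['Variable']
        # Keep the whole row so date and location are available later.
        if var == 'tasmax' and temp > float(maxtemp['Value']):
            maxtemp = row
        if var == 'tasmin' and temp < float(mintemp['Value']):
            mintemp = row

    # Values are in kelvin; subtract 273.15 to report degrees Celsius.
    # Longitudes are negated so they can be printed as degrees west.
    print "The minimum temperature is %.2f degrees C on %s at (%.3fW, %.3fN)" % \
        (float(mintemp['Value'])-273.15, mintemp['Date'][:7],
         -float(mintemp['Longitude']), float(mintemp['Latitude']))
    print "The maximum temperature is %.2f degrees C on %s at (%.3fW, %.3fN)" % \
        (float(maxtemp['Value'])-273.15, maxtemp['Date'][:7],
         -float(maxtemp['Longitude']), float(maxtemp['Latitude']))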
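
The same single-pass scan can be separated from the network request, which makes it easy to try on a few rows before pointing it at the full dataset. The following is a minimal sketch for Python 3, not part of the original notebook: the find_extremes name, the column order, and the sample rows are all made up for illustration. Only the column names (Date, Longitude, Latitude, Variable, Value) are taken from pr_min_max above.

```python
import csv
import io


def find_extremes(lines):
    """Scan CSV lines once; return (lowest tasmin row, highest tasmax row)."""
    min_row = None
    max_row = None
    for row in csv.DictReader(lines):
        value = float(row['Value'])
        if row['Variable'] == 'tasmin':
            if min_row is None or value < float(min_row['Value']):
                min_row = row
        elif row['Variable'] == 'tasmax':
            if max_row is None or value > float(max_row['Value']):
                max_row = row
    return min_row, max_row


# Hypothetical sample rows (values in kelvin), standing in for the server response.
sample = io.StringIO(
    "Date,Longitude,Latitude,Variable,Value\n"
    "2016-01-01,-100.0,40.0,tasmin,260.0\n"
    "2016-01-01,-100.0,40.0,tasmax,300.0\n"
    "2016-02-01,-105.0,45.0,tasmin,250.0\n"
    "2016-07-01,-115.0,35.0,tasmax,310.0\n"
)

low, high = find_extremes(sample)
print("min: %.2f C on %s" % (float(low['Value']) - 273.15, low['Date'][:7]))
print("max: %.2f C on %s" % (float(high['Value']) - 273.15, high['Date'][:7]))
```

Because find_extremes accepts any iterable of text lines, it should also work on a streamed HTTP response in Python 3, for example one wrapped as io.TextIOWrapper(urllib.request.urlopen(url), encoding='utf-8'). That usage is an assumption and has not been run against the data server used in this notebook.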

Analyze the Data

Let's run the pr_min_max function on our dataset. Note that the IP address used below is specific to our deployment and may differ from that of your data server.


In [20]:
# Note: replace with the IP address of your data server
pr_min_max("192.168.99.100")


The minimum temperature is -26.80 degrees C on 2016-12 at (107.179W, 44.379N)
The maximum temperature is 50.46 degrees C on 2016-07 at (116.896W, 36.529N)

Results

For the year 2016, the CESM1-CAM5 climate model under the RCP4.5 scenario predicts that the lowest temperature in the continental US will be -26.80 °C (-16.24 °F), occurring in December in Wyoming. The highest temperature is predicted to be 50.46 °C (122.83 °F), occurring in July in Death Valley, CA.
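
The Fahrenheit figures quoted above follow from the standard conversion F = C × 9/5 + 32. As a quick check, here is a small sketch; the helper names kelvin_to_celsius and celsius_to_fahrenheit are ours and do not appear elsewhere in the notebook.

```python
def kelvin_to_celsius(kelvin):
    """Convert kelvin to degrees Celsius."""
    return kelvin - 273.15


def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


# The two Celsius values reported by pr_min_max above.
for celsius in (-26.80, 50.46):
    print("%.2f C = %.2f F" % (celsius, celsius_to_fahrenheit(celsius)))
```

This prints -16.24 F and 122.83 F, matching the values in the summary.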